More Algorithms for Provable Dictionary Learning
Authors
Abstract
In dictionary learning, also known as sparse coding, the algorithm is given samples of the form y = Ax, where x ∈ R^m is an unknown random sparse vector and A is an unknown dictionary matrix in R^{n×m} (usually m > n, which is the overcomplete case). The goal is to learn A and x. This problem has been studied in neuroscience, machine learning, vision, and image processing. In practice it is solved by heuristic algorithms, and provable algorithms seemed hard to find. Recently, provable algorithms were found that work if the unknown feature vector x is √n-sparse or even sparser. Spielman et al. [SWW12] did this for dictionaries where m = n; Arora et al. [AGM13] gave an algorithm for overcomplete (m > n) and incoherent matrices A; and Agarwal et al. [AAN13] handled a similar case but with weaker guarantees. This raised the problem of designing provable algorithms that allow sparsity ≫ √n in the hidden vector x. The current paper designs algorithms that allow sparsity up to n/poly(log n). The algorithms work for a class of matrices where features are individually recoverable, a new notion identified in this paper that may motivate further work. They run in quasipolynomial time because they use limited enumeration.
Princeton University, Computer Science Department and Center for Computational Intractability. This work is supported by the NSF grants CCF-0832797, CCF-1117309, CCF-1302518, DMS-1317308, and a Simons Investigator Grant.
Google Research NYC. Part of this work was done while the author was a postdoc at EPFL, Switzerland.
Microsoft Research. Part of this work was done while the author was a graduate student at Princeton University and was supported in part by NSF grants CCF-0832797, CCF-1117309, CCF-1302518, DMS-1317308, and a Simons Investigator Grant.
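The generative model in the abstract, y = Ax with a sparse hidden vector x and an overcomplete dictionary A, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the unit-norm Gaussian columns, the ±1 nonzero entries, the sizes n, m, k, and the helper names sample_dictionary, sample_sparse_vector, and observe. The sketch only generates samples; it does not implement any of the learning algorithms.

```python
import random

def sample_dictionary(n, m, rng):
    """Draw an n x m dictionary with unit-norm Gaussian columns (illustrative choice)."""
    cols = []
    for _ in range(m):
        col = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = sum(v * v for v in col) ** 0.5
        cols.append([v / norm for v in col])
    return cols  # stored column-wise: cols[j] is column j of A

def sample_sparse_vector(m, k, rng):
    """Draw x in R^m with exactly k nonzero entries, each +1 or -1 (assumed distribution)."""
    x = [0.0] * m
    for j in rng.sample(range(m), k):
        x[j] = rng.choice([-1.0, 1.0])
    return x

def observe(cols, x):
    """Compute y = A x as a combination of the columns selected by the support of x."""
    n = len(cols[0])
    y = [0.0] * n
    for j, xj in enumerate(x):
        if xj != 0.0:
            for i in range(n):
                y[i] += xj * cols[j][i]
    return y

rng = random.Random(0)
n, m, k = 16, 32, 3          # m > n is the overcomplete case; k is the sparsity of x
A_cols = sample_dictionary(n, m, rng)
x = sample_sparse_vector(m, k, rng)
y = observe(A_cols, x)       # the learner sees only y, never A or x
```

A learner in this setting would receive many such vectors y and would have to recover both A_cols and the hidden x for each sample. The sparsity k is the quantity the abstract's bounds (√n versus n/poly(log n)) refer to.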
Similar resources
A Provable Approach for Double-Sparse Coding
Sparse coding is a crucial subroutine in algorithms for various signal processing, deep learning, and other machine learning applications. The central goal is to learn an overcomplete dictionary that can sparsely represent a given dataset. However, storage, transmission, and processing of the learned dictionary can be untenably high if the data dimension is high. In this paper, we consider the ...
Approximation and learning by greedy algorithms
We consider the problem of approximating a given element f from a Hilbert space H by means of greedy algorithms and the application of such procedures to the regression problem in statistical learning theory. We improve on the existing theory of convergence rates for both the orthogonal greedy algorithm and the relaxed greedy algorithm, as well as for the forward stepwise projection algorithm. ...
Provable Dictionary Learning via Column Signatures
In dictionary learning, also known as sparse coding, we are given samples of the form y = Ax where x ∈ R^m is an unknown random sparse vector and A is an unknown dictionary matrix in R^{n×m} (usually m > n, which is the overcomplete case). The goal is to learn A and x. This problem has been studied in neuroscience, machine learning, vision, and image processing. In practice it is solved by heuristic...
Provable Algorithms for Machine Learning Problems
Modern machine learning algorithms can extract useful information from text, images and videos. All these applications involve solving NP-hard problems in average case using heuristics. What properties of the input allow it to be solved efficiently? Theoretically analyzing the heuristics is often very challenging. Few results were known. This thesis takes a different approach: we identify natur...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal degrades significantly in the presence of environmental noise, which impairs the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...
Journal: CoRR
Volume: abs/1401.0579
Publication date: 2014